N=1:

rank | frequency | n-gram |
---|---|---|
1 | 105056 | -e |
2 | 100885 | -o |
3 | 100816 | -a |
4 | 84730 | -i |
5 | 37033 | -s |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 20964 | -no |
2 | 19762 | -ne |
3 | 18832 | -to |
4 | 17231 | -te |
5 | 16157 | -re |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 12134 | -one |
2 | 9688 | -ano |
3 | 7366 | -ato |
4 | 7144 | -nte |
5 | 5988 | -ata |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 8534 | -ione |
2 | 4348 | -ente |
3 | 3328 | -ento |
4 | 2878 | -ando |
5 | 2717 | -tore |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 6623 | -zione |
2 | 2847 | -mento |
3 | 2527 | -mente |
4 | 1747 | -atore |
5 | 1652 | -arono |

The tables show the most frequent letter N-grams at the end of words for N=1…5. The procedure mirrors that of 2.2.5 Most frequent word beginnings, except that the aim here is suffix detection instead of prefix detection.
For N=3:
SELECT @pos := (@pos + 1), xx.*
FROM (SELECT @pos := 0) r,
     (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word, 3))
      FROM words
      WHERE w_id > 100
      GROUP BY RIGHT(word, 3)
      ORDER BY cnt DESC) xx
LIMIT 5;
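
The same counting can be sketched outside the database for any N. The following is a minimal Python illustration, not the query used to produce the tables: the function name `top_endings` and the toy word list are hypothetical, and the input is assumed to be a list of distinct word forms, as one row per word in the `words` table would suggest.

```python
from collections import Counter


def top_endings(words, n, limit=5):
    """Return (rank, frequency, n-gram) rows for the most frequent word endings."""
    # w[-n:] yields the whole word when it is shorter than n characters,
    # which matches the behaviour of RIGHT(word, n) in the SQL query.
    counts = Counter("-" + w[-n:] for w in words)
    # most_common() orders the endings by descending frequency.
    return [
        (rank, cnt, gram)
        for rank, (gram, cnt) in enumerate(counts.most_common(limit), start=1)
    ]


# Hypothetical toy word list for illustration only (not corpus data).
sample = ["nazione", "stazione", "azione", "momento", "mente", "parlare"]

for n in (3, 5):
    print(f"N={n}")
    for row in top_endings(sample, n):
        print(row)
```

Calling `top_endings(sample, n)` for n from 1 to 5 gives one small table per N, in the same rank, frequency, n-gram layout as above.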
2.2.5 Most frequent word beginnings